
305 research outputs found

Sequential Symbolic Regression with Genetic Programming

Author: D White
GY Lee
J Demšar
JA Walker
L Vanneschi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

This chapter describes the Sequential Symbolic Regression (SSR) method, a new strategy for function approximation in symbolic regression. The SSR method is inspired by the sequential covering strategy from machine learning, but instead of sequentially reducing the size of the problem being solved, it sequentially transforms the original problem into potentially simpler problems. This transformation is performed according to the semantic distances between the desired and obtained outputs and a geometric semantic operator. The rationale behind SSR is that, after generating a suboptimal function f via symbolic regression, the output errors can be approximated by another function in a subsequent iteration. The method was tested on eight polynomial functions and compared with canonical genetic programming (GP) and geometric semantic genetic programming (SGP). Results showed that SSR significantly outperforms SGP and shows no statistically significant difference from GP. More importantly, they show the potential of the proposed strategy: an effective way of applying geometric semantic operators to combine different (partial) solutions, avoiding the exponential growth problem arising from the use of these operators.

Crossref

Kent Academic Repository
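
The rationale stated in the SSR abstract above (the errors of one suboptimal function are approximated by another function in the next iteration) can be illustrated with a small sketch. This is not the chapter's method: a least-squares fit of a single monomial stands in for each genetic programming run, the geometric semantic transformation is omitted, and the names fit_monomial, sequential_fit and predict are hypothetical.

```python
def fit_monomial(xs, residual, max_degree=3):
    """Least-squares fit of one term a * x**d to the residual.

    Assumption: this stands in for a full GP run in each iteration.
    Returns (coefficient, degree, squared error).
    """
    # Start from the zero function, so the error can never get worse.
    best = (0.0, 0, sum(r * r for r in residual))
    for d in range(max_degree + 1):
        basis = [x ** d for x in xs]
        denom = sum(b * b for b in basis)
        if denom == 0:
            continue
        a = sum(r * b for r, b in zip(residual, basis)) / denom
        err = sum((r - a * b) ** 2 for r, b in zip(residual, basis))
        if err < best[2]:
            best = (a, d, err)
    return best


def sequential_fit(xs, ys, stages=3):
    """Fit the target, then repeatedly fit whatever error is left."""
    residual = list(ys)
    terms, errors = [], []
    for _ in range(stages):
        a, d, _ = fit_monomial(xs, residual)
        terms.append((a, d))
        # The next stage only sees the remaining output errors.
        residual = [r - a * x ** d for r, x in zip(residual, xs)]
        errors.append(sum(r * r for r in residual))
    return terms, errors


def predict(terms, x):
    """The final model is the sum of all partial solutions."""
    return sum(a * x ** d for a, d in terms)


xs = [-2, -1, 0, 1, 2]
ys = [x * x + x for x in xs]  # toy polynomial target
terms, errors = sequential_fit(xs, ys)
```

On this toy target the squared error cannot increase from one stage to the next, because each stage may fall back to the zero function.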

Impact of UV radiation on the physical properties of polypropylene floating row covers

Author: Demšar A
Svetec DG
Žnidarčič D
Publication venue: 'African Journals Online (AJOL)'
Publication date: 16/10/2013

In intensive horticulture, various forms of protected area are used for growing seedlings and cultivating vegetables in all seasons. The easiest and cheapest form of protected area is agrotextile, which can be laid directly over vegetable crops (row cover). Agrotextiles are nonwovens manufactured from textile fibres, which are usually of chemical origin. Textiles used as agrotextiles require suitable tensile strength and good permeability characteristics, with no significant deterioration under the influence of weather changes and UV radiation. The properties of agrotextiles depend on the fibres they are made of and on the type and conditions of production. The purpose of this study was to analyse the influence of simulated sunlight radiation (xenon lamp) on the physical properties of polypropylene (PP) nonwoven material, which is used for the production of agrotextiles. The research showed that the properties of row cover change when irradiated with UV light. Tensile, tearing and bursting properties worsen after irradiation, while air permeability and water vapour show a small increase. The changes in the properties are a consequence of changes in the fibres' molecular and supermolecular structure, which are reflected in changed fibre properties and consequently in the properties of the nonwoven.
Key words: Agrotextile, polypropylene, nonwovens, UV radiation, properties

AJOL - African Journals Online

Randomized Reference Classifier with Gaussian Distribution and Soft Confusion Matrix Applied to the Improving Weak Classifiers

Author: B. Bergmann
D Yekutieli
DJ Hand
F Provost
F Wilcoxon
HA David
J Demšar
James O. Berger
JR Quinlan
L Breiman
L Kuncheva
M Friedman
M Hall
M Kurzynski
Marcin Majak
Marek Kurzynski
Marina Sokolova
N Johnson
Pawel Trajdos
R Lysiak
S Garcia
T Cover
T Woloszynski
Y Freund
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/05/2019

In this paper, the issue of building the RRC model using probability distributions other than the beta distribution is addressed. More precisely, we propose to build the RRC model using the truncated normal distribution. Heuristic procedures for the expected value and the variance of the truncated normal distribution are also proposed. The proposed approach is tested using an SCM-based model to examine the consequences of applying the truncated normal distribution in the RRC model. The experimental evaluation is performed using four different base classifiers and seven quality measures. The results showed that the proposed approach is comparable to the RRC model built using the beta distribution. What is more, for some base classifiers, the truncated-normal-based SCM algorithm turned out to be better at discovering objects coming from minority classes.
Comment: arXiv admin note: text overlap with arXiv:1901.0882

arXiv.org e-Print Archive

Crossref

Classification of time series by shapelet transformation

Author: Anthony Bagnall
C Cortes
C Hoare
C Shannon
C Stransky
D Vries De
Edgaras Baranauskas
H Ding
J Demšar
J Lines
James Mapp
Jason Lines
JJ Rodriguez
Jon Hills
L Breiman
L Ye
M Bober
M Hall
N Friedman
P Duarte-Neto
S Campana
W Kruskal
Y Jeong
Z Xing
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2014

Time-series classification (TSC) problems present a specific challenge for classification algorithms: how to measure similarity between series. A shapelet is a time-series subsequence that allows for TSC based on local, phase-independent similarity in shape. Shapelet-based classification uses the similarity between a shapelet and a series as a discriminatory feature. One benefit of the shapelet approach is that shapelets are comprehensible and can offer insight into the problem domain. The original shapelet-based classifier embeds the shapelet-discovery algorithm in a decision tree and uses information gain to assess the quality of candidates, finding a new shapelet at each node of the tree through an enumerative search. Subsequent research has focused mainly on techniques to speed up the search. We examine how best to use the shapelet primitive to construct classifiers. We propose a single-scan shapelet algorithm that finds the best k shapelets, which are used to produce a transformed dataset, where each of the k features represents the distance between a time series and a shapelet. The primary advantages over the embedded approach are that the transformed data can be used in conjunction with any classifier, and that there is no recursive search for shapelets. We demonstrate that the transformed data, in conjunction with more complex classifiers, gives greater accuracy than the embedded shapelet tree. We also evaluate three similarity measures that produce equivalent results to information gain in less time. Finally, we show that by conducting post-transform clustering of shapelets, we can enhance the interpretability of the transformed data. We conduct our experiments on 29 datasets: 17 from the UCR repository, and 12 we provide ourselves.

Crossref

University of East Anglia digital repository
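
The transform described in the shapelet abstract above, where each of the k features is the distance between a time series and a shapelet, can be sketched as follows. This is a minimal illustration under stated assumptions: the distance is taken as the smallest Euclidean distance over all equal-length windows, no normalisation of subsequences is applied, the shapelets are given rather than discovered, and the function names are hypothetical.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equal-length window of the series (assumed distance definition)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def shapelet_transform(dataset, shapelets):
    """Map every series to a row of k distances, one per shapelet."""
    return [[shapelet_distance(series, sh) for sh in shapelets]
            for series in dataset]


dataset = [[0, 0, 1, 2, 1, 0], [0, 0, 0, 0, 0, 0]]
shapelets = [[1, 2, 1], [0, 0]]  # k = 2, chosen by hand for illustration
features = shapelet_transform(dataset, shapelets)
```

The resulting rows are ordinary feature vectors, which is why the transformed data can be passed to any classifier.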

Ontology of core data mining entities

Author: A Bernstein
A Golbraikh
A Karalic
B Smith
C Silla
C Vens
D Demšar
D Kocev
D Qi
D Young
DJ Hand
F Serban
G Madjarov
G Tsoumakas
GH Bakir
H Mannila
HP Kriegel
I Slavkov
J Vanschoren
K Button
Larisa Soldatova
LN Soldatova
M Courtot
M Ford
M Žáková
MA Avery
MF López
O Spjuth
P Robinson
Panče Panov
Q Yang
R Caruana
R Guha
RD King
RR Brinkman
Sašo Džeroski
T Dietterich
V Podpečan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/07/2014

In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.

Crossref

Brunel University Research Archive

Exploiting Anti-monotonicity of Multi-label Evaluation Measures for Inducing Multi-label Rules

Author: D Malerba
E Loza Mencía
F Charte
F Thabtah
G Bosc
G Tsoumakas
J Demšar
JL Ávila-Jiménez
K Dembczyński
M Allamanis
U Kayande
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/12/2018

Exploiting dependencies between labels is considered to be crucial for multi-label classification. Rules are able to expose label dependencies such as implications, subsumptions or exclusions in a human-comprehensible and interpretable manner. However, the induction of rules with multiple labels in the head is particularly challenging, as the number of label combinations which must be taken into account for each rule grows exponentially with the number of available labels. To overcome this limitation, algorithms for exhaustive rule mining typically use properties such as anti-monotonicity or decomposability in order to prune the search space. In the present paper, we examine whether commonly used multi-label evaluation metrics satisfy these properties and therefore are suited to prune the search space for multi-label heads.
Comment: Preprint version. To appear in: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2018. See http://www.ke.tu-darmstadt.de/bibtex/publications/show/3074 for further information. arXiv admin note: text overlap with arXiv:1812.0005

arXiv.org e-Print Archive

Crossref

Combination of linear classifiers using score function -- analysis of possible combination strategies

Author: AH Ko
AS Britto
B Cyganek
B. Bergmann
C Cortes
CD Manning
D Yekutieli
E Hüllermeier
F Wilcoxon
G Giacinto
Geoffrey J. McLachlan
H Drucker
J Demšar
Karl Pearson
L Xu
L.I. Kuncheva
Luc Devroye
M Friedman
M Hall
M Przybyła-Kasperek
M Reif
M Skurichina
M Woźniak
Marina Sokolova
S Garcia
S Holm
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/05/2019

In this work, we addressed the issue of combining linear classifiers using their score functions. The value of the score function depends on the distance from the decision boundary. Two score functions were tested and four different combination strategies were investigated. During the experimental study, the proposed approach was applied to a heterogeneous ensemble and compared to two reference methods, majority voting and model averaging. The comparison was made in terms of seven different quality criteria. The results show that the strategies based on the simple average and the trimmed average are the best of the geometrical combination strategies.

arXiv.org e-Print Archive

Crossref
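
The abstract above says that the score depends on the distance from the decision boundary and that the simple and trimmed averages performed best. A minimal sketch of that idea follows. It assumes the score is the signed Euclidean distance to the hyperplane w.x + b = 0 and that one score per side is trimmed; the paper's two actual score functions are not given in the abstract, and all function names here are hypothetical.

```python
import math


def signed_distance(weights, bias, x):
    """Signed Euclidean distance from x to the hyperplane w.x + b = 0
    (an assumed score function, not necessarily the paper's)."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * xi for w, xi in zip(weights, x)) + bias) / norm


def simple_average(scores):
    return sum(scores) / len(scores)


def trimmed_average(scores, trim=1):
    """Drop the `trim` lowest and `trim` highest scores, then average."""
    ordered = sorted(scores)
    kept = ordered[trim:len(ordered) - trim]
    if not kept:
        raise ValueError("trim removes every score")
    return sum(kept) / len(kept)


def ensemble_predict(classifiers, x, combine=simple_average):
    """classifiers: list of (weights, bias); returns +1 or -1."""
    scores = [signed_distance(w, b, x) for w, b in classifiers]
    return 1 if combine(scores) >= 0 else -1


# Two classifiers agree on the positive side; a third is a strong outlier.
classifiers = [([1.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([3.0, 4.0], -50.0)]
x = [1.0, 1.0]
by_mean = ensemble_predict(classifiers, x, simple_average)
by_trimmed = ensemble_predict(classifiers, x, trimmed_average)
```

In this toy case the outlying score dominates the simple average, while the trimmed average discards it, so the two strategies disagree.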

Iso-osmotic regulation of nitrate accumulation in lettuce (Lactuca sativa L.)

Author: Abd-Elmoniem E. M
Barker A. V.
Behr U.
Breteler H.
Burns I. G.
Cantliffe D. L.
Corré W. J.
Demšar J.
Drews M.
Graves C. J.
Gunes A.
Houba V. J. G.
Ian G. Burns
Kefeng Zhang
Marschner H.
Mary K. Turner
McCall D.
Mott R. L.
Nobel P. A.
Raab T. K.
Raynal Lacroix C.
Rodney Edmondson
Savvas D.
Scaife A.
Seginer I.
Steingröver E. G.
Stienstra A. W.
Van Der Boon J.
Wyn Jones R. G.
Zhang K.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2010

Concerns about possible health hazards arising from human consumption of lettuce and other edible vegetable crops with high concentrations of nitrate have generated demands for a greater understanding of the processes involved in its uptake and accumulation in order to devise more sustainable strategies for its control. This paper evaluates a proposed iso-osmotic mechanism for the regulation of nitrate accumulation in lettuce (Lactuca sativa L.) heads. This mechanism assumes that changes in the concentrations of nitrate and all other endogenous osmotica (including anions, cations and neutral solutes) are continually adjusted in tandem to minimise differences in osmotic potential of the shoot sap during growth, with these changes occurring independently of any variations in external water potential. The hypothesis was tested using data from six new experiments, each with a single unique treatment comprising a separate combination of light intensity, N source (nitrate with or without ammonium) and nitrate concentration, carried out hydroponically in a glasshouse using a butterhead lettuce variety. Repeat measurements of plant weights and estimates of all of the main soluble constituents (nitrate, potassium, calcium, magnesium, organic anions, chloride, phosphate, sulphate and soluble carbohydrates) in the shoot sap were made at intervals from about 2 weeks after transplanting until commercial maturity, and the data were used to calculate changes in average osmotic potential in the shoot. Results showed that nitrate concentrations in the sap increased when average light levels were reduced by between 30 and 49% and (to a lesser extent) when nitrate was supplied at a supra-optimal concentration, and declined with partial replacement of nitrate by ammonium in the external nutrient supply. The associated changes in the proportions of other endogenous osmotica, in combination with the adjustment of shoot water content, maintained the total solute concentrations in shoot sap approximately constant and minimised differences in osmotic potential between treatments at each sampling date. There was, however, a gradual increase in osmotic potential (i.e. a decline in total solute concentration) over time, largely caused by increases in shoot water content associated with the physiological and morphological development of the plants. Regression analysis using normalised data (to correct for these time trends) showed that the results were consistent with a 1:1 exchange between the concentrations of nitrate and the sum of all other endogenous osmotica throughout growth, providing evidence that an iso-osmotic mechanism (incorporating both concentration and volume regulation) was involved in controlling nitrate concentrations in the shoot.

Crossref

Warwick Research Archives Portal Repository

Improving the k-Nearest Neighbour Rule by an Evolutionary Voting Approach

Author: A. Abraham
D. Mateos-García
E. Corchado
F. Fernandez
J. Demšar
J. Sanz
J.M. Keller
J.M. Mendel
M. Raymer
M.A. Tahir
M.L. Raymer
R. Paredes
S. García
W. Liu
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2014

This work presents an evolutionary approach to modify the voting system of the k-Nearest Neighbours (kNN) rule. The main novelty of this article lies in optimizing the votes regardless of the distance of every neighbour. The real-valued vector calculated through the evolutionary process can be seen as the relative contribution of every neighbour to selecting the label of an unclassified example. We have tested our approach on 30 datasets of the UCI repository and compared the results with those obtained from 6 other variants of the kNN predictor, obtaining a statistically supported improvement.

Crossref

idUS. Depósito de Investigación Universidad de Sevilla
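
The voting scheme in the kNN abstract above, in which a real-valued vector gives the relative contribution of every neighbour regardless of its distance, can be sketched as below. The evolutionary search that produces the vector is not shown; the weights are supplied by hand, and the function name and data are hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, weights):
    """train: list of (features, label) pairs.
    weights[i] is the vote of the i-th nearest neighbour; it depends only
    on the neighbour's rank, not on how far away it is.
    Assumption: the weights are given (the paper evolves them)."""
    k = len(weights)
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for weight, (_, label) in zip(weights, neighbours):
        votes[label] += weight
    return max(votes, key=votes.get)


train = [((0.0,), "a"), ((1.0,), "b"), ((1.2,), "b")]
query = (0.4,)
plain = weighted_knn_predict(train, query, [1.0, 1.0, 1.0])    # ordinary 3-NN vote
skewed = weighted_knn_predict(train, query, [1.0, 0.3, 0.3])   # nearest neighbour dominates
```

With equal weights this reduces to the standard majority vote; changing the vector changes the decision without touching the distance computation.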

Evolving an optimal decision template for combining classifiers.

Author: D Karaboga
I Barandiaran
J Demšar
J Kittler
KM Ting
LI Kuncheva
MU Şen
P Shunmugapriya
TT Nguyen
Y Chen
ZH Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/12/2019

In this paper, we aim to develop an effective combining algorithm for ensemble learning systems. The Decision Template method, one of the most popular combining algorithms for ensemble systems, does not perform well on certain datasets, such as those with imbalanced data. Moreover, point estimation by computing the average value of the outputs of base classifiers in the Decision Template method is sometimes not a good representation, especially for skewed datasets. Here we propose to search for an optimal decision template in the combining algorithm for a heterogeneous ensemble. To do this, we first generate the base classifiers by training the pre-selected learning algorithms on the given training set. The meta-data of the training set is then generated via cross validation. Using the Artificial Bee Colony algorithm, we search for the optimal template that minimizes the empirical 0–1 loss function on the training set. The class label is assigned to the unlabeled sample based on the maximum of the similarity between the optimal decision template and the sample’s meta-data. Experiments conducted on the UCI datasets demonstrated the superiority of the proposed method over several benchmark algorithms.

Crossref

Open Access Institutional Repository at Robert Gordon University
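
For orientation, the averaged Decision Template baseline that the abstract above sets out to improve can be sketched as follows. This is not the proposed method: the Artificial Bee Colony search is omitted, similarity is assumed to be the negative squared Euclidean distance, and the function names and meta-data values are hypothetical.

```python
def build_templates(meta_data, labels):
    """One template per class: the mean meta-data vector of its training
    samples (the point estimate the abstract criticises)."""
    sums, counts = {}, {}
    for row, label in zip(meta_data, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def classify(templates, sample):
    """Assign the class whose template is most similar to the sample's
    meta-data (assumed similarity: negative squared Euclidean distance)."""
    def similarity(template):
        return -sum((t - s) ** 2 for t, s in zip(template, sample))
    return max(templates, key=lambda label: similarity(templates[label]))


# Each row concatenates the class-probability outputs of two base classifiers.
meta_data = [[0.9, 0.1, 0.8, 0.2],
             [0.7, 0.3, 0.6, 0.4],
             [0.2, 0.8, 0.1, 0.9]]
labels = ["pos", "pos", "neg"]
templates = build_templates(meta_data, labels)
label_a = classify(templates, [0.75, 0.25, 0.65, 0.35])
label_b = classify(templates, [0.1, 0.9, 0.2, 0.8])
```

An optimised template would replace the output of build_templates while leaving the similarity-based classification step unchanged.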